13 research outputs found
Performance analysis of single slope solar still using sensible heat storage material
Direct sunlight has long been utilized for the distillation of water. Solar distillation plants are used to supply desalinated water to small communities in remote coastal areas. Solar stills are easy to construct (rural people can build them from locally available materials), simple enough to be operated by unskilled personnel, require little maintenance, and have almost no operating cost. To increase the efficiency of the solar still, we used sensible heat storage materials such as marbles, pebbles, blue metal stone, and basalt stone. With a sensible heat storage material, the distillation process continues both day and night.
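The principle behind sensible heat storage is that the material absorbs energy as Q = m·c·ΔT during the day and releases it to the basin water after sunset. A minimal sketch of this relation, with illustrative values that are assumptions rather than figures from the study:

```python
def sensible_heat_stored(mass_kg, specific_heat_j_per_kg_k, delta_t_k):
    """Energy stored as sensible heat: Q = m * c * dT (joules)."""
    return mass_kg * specific_heat_j_per_kg_k * delta_t_k

# Illustrative values (assumed, not from the study): 10 kg of pebbles with
# specific heat c ~ 880 J/(kg*K), heated 30 K above ambient during the day.
q = sensible_heat_stored(10, 880, 30)
print(q)  # 264000 J available for release to the basin water after sunset
```

The stored energy keeps the basin water warm overnight, which is why distillation continues after sunset.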
Invariant Slot Attention: Object Discovery with Slot-Centric Reference Frames
Automatically discovering composable abstractions from raw perceptual data is
a long-standing challenge in machine learning. Recent slot-based neural
networks that learn about objects in a self-supervised manner have made
exciting progress in this direction. However, they typically fall short at
adequately capturing spatial symmetries present in the visual world, which
leads to sample inefficiency, such as when entangling object appearance and
pose. In this paper, we present a simple yet highly effective method for
incorporating spatial symmetries via slot-centric reference frames. We
incorporate equivariance to per-object pose transformations into the attention
and generation mechanism of Slot Attention by translating, scaling, and
rotating position encodings. These changes result in little computational
overhead, are easy to implement, and can result in large gains in terms of data
efficiency and overall improvements to object discovery. We evaluate our method
on a wide range of synthetic object discovery benchmarks, namely CLEVR,
Tetrominoes, CLEVRTex, Objects Room and MultiShapeNet, and show promising
improvements on the challenging real-world Waymo Open dataset. Comment: Accepted at ICML 2023. Project page: https://invariantsa.github.io
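The core idea of a slot-centric reference frame is to translate and scale the absolute position grid by each slot's estimated position and scale before it enters the attention and generation steps. A minimal sketch of that coordinate transform (grid shape and values are hypothetical; the full method also learns the per-slot pose parameters):

```python
import numpy as np

def slot_relative_grid(abs_grid, slot_pos, slot_scale):
    """Map an absolute position grid into a slot-centric reference frame
    by translating by the slot's position and dividing by its scale.
    abs_grid: (H*W, 2) coordinates in [-1, 1]; slot_pos, slot_scale: (2,)."""
    return (abs_grid - slot_pos) / slot_scale

# Hypothetical 4x4 coordinate grid and one slot centred at (0.5, 0.5)
# with scale 0.25 along both axes.
ys, xs = np.meshgrid(np.linspace(-1, 1, 4), np.linspace(-1, 1, 4), indexing="ij")
grid = np.stack([ys.ravel(), xs.ravel()], axis=-1)          # (16, 2)
rel = slot_relative_grid(grid, np.array([0.5, 0.5]), np.array([0.25, 0.25]))
print(rel.shape)  # (16, 2)
```

Because each slot sees positions expressed relative to its own frame, the position encodings built from `rel` are invariant to where the object sits in the image, which is the source of the data-efficiency gains the abstract describes.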
RUST: Latent Neural Scene Representations from Unposed Imagery
Inferring the structure of 3D scenes from 2D observations is a fundamental
challenge in computer vision. Recently popularized approaches based on neural
scene representations have achieved tremendous impact and have been applied
across a variety of applications. One of the major remaining challenges in this
space is training a single model which can provide latent representations which
effectively generalize beyond a single scene. Scene Representation Transformer
(SRT) has shown promise in this direction, but scaling it to a larger set of
diverse scenes is challenging and necessitates accurately posed ground truth
data. To address this problem, we propose RUST (Really Unposed Scene
representation Transformer), a pose-free approach to novel view synthesis
trained on RGB images alone. Our main insight is that one can train a Pose
Encoder that peeks at the target image and learns a latent pose embedding which
is used by the decoder for view synthesis. We perform an empirical
investigation into the learned latent pose structure and show that it allows
meaningful test-time camera transformations and accurate explicit pose
readouts. Perhaps surprisingly, RUST achieves similar quality as methods which
have access to perfect camera pose, thereby unlocking the potential for
large-scale training of amortized neural scene representations.Comment: CVPR 2023 Highlight. Project website: https://rust-paper.github.io
Object-Centric Learning with Slot Attention
Learning object-centric representations of complex scenes is a promising step
towards enabling efficient abstract reasoning from low-level perceptual
features. Yet, most deep learning approaches learn distributed representations
that do not capture the compositional properties of natural scenes. In this
paper, we present the Slot Attention module, an architectural component that
interfaces with perceptual representations such as the output of a
convolutional neural network and produces a set of task-dependent abstract
representations which we call slots. These slots are exchangeable and can bind
to any object in the input by specializing through a competitive procedure over
multiple rounds of attention. We empirically demonstrate that Slot Attention
can extract object-centric representations that enable generalization to unseen
compositions when trained on unsupervised object discovery and supervised
property prediction tasks.
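The competitive procedure the abstract mentions is dot-product attention with the softmax taken over the slot axis, so slots compete for input features, followed by a weighted-mean update. A simplified NumPy sketch (the full module also applies learned projections and a GRU/MLP update, omitted here for brevity):

```python
import numpy as np

def slot_attention_step(slots, inputs, eps=1e-8):
    """One simplified Slot Attention round.
    slots: (K, D) slot vectors; inputs: (N, D) perceptual features."""
    d = slots.shape[-1]
    logits = inputs @ slots.T / np.sqrt(d)                  # (N, K)
    attn = np.exp(logits - logits.max(axis=1, keepdims=True))
    attn = attn / attn.sum(axis=1, keepdims=True)           # softmax over slots: competition
    attn = attn / (attn.sum(axis=0, keepdims=True) + eps)   # normalize for weighted mean
    return attn.T @ inputs                                  # (K, D) updated slots

rng = np.random.default_rng(0)
slots = rng.normal(size=(3, 8))       # K = 3 exchangeable slots
feats = rng.normal(size=(16, 8))      # N = 16 input features
for _ in range(3):                    # multiple rounds of iterative refinement
    slots = slot_attention_step(slots, feats)
print(slots.shape)  # (3, 8)
```

Because the softmax normalizes across slots rather than across inputs, each input feature distributes its attention among the slots, which is what drives slots to specialize on distinct objects.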
Simple Open-Vocabulary Object Detection with Vision Transformers
Combining simple architectures with large-scale pre-training has led to
massive improvements in image classification. For object detection,
pre-training and scaling approaches are less well established, especially in
the long-tailed and open-vocabulary setting, where training data is relatively
scarce. In this paper, we propose a strong recipe for transferring image-text
models to open-vocabulary object detection. We use a standard Vision
Transformer architecture with minimal modifications, contrastive image-text
pre-training, and end-to-end detection fine-tuning. Our analysis of the scaling
properties of this setup shows that increasing image-level pre-training and
model size yield consistent improvements on the downstream detection task. We
provide the adaptation strategies and regularizations needed to attain very
strong performance on zero-shot text-conditioned and one-shot image-conditioned
object detection. Code and models are available on GitHub. Comment: ECCV 2022 camera-ready version
Self-supervised learning using motion and visualizing convolutional neural networks
We propose a novel method for learning convolutional image representations without manual supervision. We use motion, in the form of optical flow, to supervise representations of static images. Training a network to predict flow from a single image can be needlessly difficult due to intrinsic ambiguities in this prediction task. We instead propose two simpler learning goals: (a) embed pixels such that the similarity between their embeddings matches that between their optical-flow vectors (CPFS), or (b) segment the image such that optical-flow within segments constitutes coherent motion (S3-CNN). At test time, the learned deep network can be used without access to video or flow information and transferred to various computer vision tasks such as image classification, detection, and segmentation. Our CPFS model achieves state-of-the-art results in self-supervision using motion cues, as demonstrated on standard transfer learning benchmarks.
Despite high transfer learning performance, we feel the need to visualize the representation learned by our self-supervised CPFS model. With that motivation we develop a suite of visualization methods and study several landmark representations, both shallow and deep. These visualizations are based on the concept of “natural pre-image”, that is, a natural-looking image whose representation has some notable property. We study three such visualizations: inversion, in which the aim is to reconstruct an image from its representation; activation maximization, in which we search for patterns that maximally stimulate a representation component; and caricaturization, in which the visual patterns that a representation detects in an image are exaggerated. We formulate these into a regularized energy-minimization framework and demonstrate its effectiveness. We show that our method can invert HOG features more accurately than recent alternatives while being applicable to CNNs too. We apply these visualization techniques to our self-supervised CPFS model and contrast it with visualizations of a fully supervised AlexNet and a randomly initialized one.
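The CPFS objective described above, matching the similarity structure of pixel embeddings to that of their optical-flow vectors, can be sketched as a loss over pairwise cosine similarities. The function name and exact formulation here are illustrative, not the paper's:

```python
import numpy as np

def cpfs_loss(embeddings, flows):
    """Sketch of a cross-pixel flow-similarity objective: penalise mismatch
    between pairwise cosine similarities of pixel embeddings and those of
    their optical-flow vectors. embeddings: (N, D); flows: (N, 2)."""
    def cos_sim(x):
        x = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
        return x @ x.T                                  # (N, N) similarity matrix
    return np.mean((cos_sim(embeddings) - cos_sim(flows)) ** 2)

# When embeddings are aligned with the flow directions, the loss vanishes.
flows = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(round(cpfs_loss(flows * 2.0, flows), 6))  # 0.0
```

Note that nothing in this objective requires flow at test time: once trained, the embedding network runs on a single static image, which is why the representation transfers to classification, detection, and segmentation without video input.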